An upper bound on the error probability is obtained for digital
communication (with average power $P_0$ and no bandwidth constraint) in the
presence of additive white Gaussian noise (with one-sided spectral density
$N_0$) with the use of a noiseless feedback link. A repeat-request strategy
is used: the receiver decodes a signal only when it is relatively sure that
one particular message was actually transmitted; otherwise it requests (via
the feedback channel) a retransmission. We show that as the coding delay $T$
becomes large, we can transmit at an effective rate $R < C = P_0/N_0$, the
channel capacity, with error probability $P_e$ approximately
$\exp\{-T[(\sqrt{C} - \sqrt{R})^2 + C - R]\}$, which is a considerable
improvement over the reliability attainable with a one-way channel. These
results parallel those obtained earlier by Forney for the discrete
memoryless channel.

I. INTRODUCTION 

In a recent paper, Forney studied a repeat-request strategy for
communication of digital information over a discrete memoryless
channel when a feedback channel is available (Ref. 3). In this system the
receiver decodes a received message only when it is relatively "sure"
that one particular message was actually transmitted. If the receiver
is not confident that one particular message was actually transmitted,
then it requests (via the feedback channel) that the transmitter repeat
the message. Forney showed that considerable improvement in the
resulting error probability (over the best one-way scheme) was
obtainable with a negligible degradation in the effective rate of
transmission. In this paper we apply Forney's ideas to the additive white
Gaussian noise channel (with no bandwidth constraint) and obtain
analogous results. Furthermore, our coding scheme is constructive:
the codes are orthogonal codes.

We will consider the following channel. The channel input signal
is a real-valued function $s(t)$, defined on the interval $[0, T]$, which
satisfies the "energy" constraint

$$\int_0^T s^2(t)\,dt = P_0 T. \tag{1}$$

The average signal "power" is therefore $P_0$. The channel output $r(t)$
is the sum of $s(t)$ and a sample $n(t)$ from a white Gaussian noise process
with one-sided spectral density $N_0$ (and with mean zero). By expanding
$s(t)$, $r(t)$ and $n(t)$ on any orthonormal basis of $\mathcal{L}_2[0, T]$,
it is easy to show that an equivalent channel model is as follows
(Refs. 8, 9). (This equivalent channel model is the one we use in this
paper.) The input signals are (semi-infinite) vectors
$\mathbf{x} = (x_1, x_2, \ldots)$ which satisfy

$$\sum_{k=1}^{\infty} x_k^2 = AT. \tag{2}$$

The channel output is a vector $\mathbf{y} = (y_1, y_2, \ldots)$, where

$$y_k = x_k + z_k, \qquad k = 1, 2, \ldots,$$

and the $z_k$ ($k = 1, 2, \ldots$) are independent Gaussian variates with
zero mean and unit variance. The parameter $A$ is equal to $2P_0/N_0$, and
we assume that $A$ is held fixed throughout the paper. We also assume that
it takes $T$ seconds for the channel to process $\mathbf{x}$, and that
successive $T$-second transmissions are independent.

A code with parameters $M$ and $T$ is a set of $M$ signals (called "code
vectors" or "code words") $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots)$,
$i = 1, 2, \ldots, M$, which satisfy equation (2), that is

$$\sum_{k=1}^{\infty} x_{ik}^2 = AT, \qquad i = 1, 2, \ldots, M. \tag{3}$$

We assume that each of the $M$ code words is equally likely to be
transmitted, so that the transmission rate is $R = (1/T) \ln M$ nats
(natural units) per second, and $M = e^{RT}$. It is the task of the
receiver to examine the channel output $\mathbf{y}$ and to announce the
code word, say $D(\mathbf{y})$, which it believes was actually transmitted.
Let $P_{ei}$ be the probability that $D(\mathbf{y}) \ne \mathbf{x}_i$ given
that $\mathbf{x}_i$ is transmitted. The overall error probability
is therefore

$$P_e = \frac{1}{M} \sum_{i=1}^{M} P_{ei}.$$
It is easy to show that for a given code, the "optimal" decoding rule $D$
(which minimizes $P_e$) selects for $D(\mathbf{y})$ that code word
$\mathbf{x}_i$ which maximizes (with respect to $i$) the inner product

$$\langle \mathbf{x}_i, \mathbf{y} \rangle = \sum_{k=1}^{\infty} x_{ik} y_k.$$

Define $P_e^*(M, T)$ as the smallest attainable error probability $P_e$ for
a code with parameters $M$ and $T$. Set $M = [e^{RT}]$, and let
$T \to \infty$ with the rate $R$ held fixed. Then it is well known that if
$R < A/2 = P_0/N_0 = C$, the "channel capacity,"

$$P_e^*([e^{RT}], T) = \exp\{-E_0(R)\,T\,[1 + \epsilon_0(T)]\}, \tag{4}$$

where $E_0(R) > 0$, and $\epsilon_0(T) \to 0$ as $T \to \infty$. Thus at
rates $R < C$, the error probability tends to zero exponentially in $T$.
Further, for rates $R > C$, $P_e^*([e^{RT}], T) \to 1$, so that the
capacity $C$ is the supremum of the rates for which "error-free" coding is
possible.

Although this type of behavior of $P_e^*$ is typical of a large class of
channels, the present channel is unique in two ways. First, the exponent
$E_0(R)$ is known exactly, namely

$$E_0(R) = \begin{cases} C/2 - R, & 0 \le R \le C/4, \\ [C^{1/2} - R^{1/2}]^2, & C/4 \le R \le C. \end{cases} \tag{5}$$

Second, an explicit construction of codes which achieve error probability
as in equation (4) is known. In fact, $P_e$ as in equations (4) and (5)
can be achieved when the code is any set of $M$ orthogonal vectors.
The simplest such code is that for which $x_{ik}$ (the $k$th coordinate of
$\mathbf{x}_i$) is given by

$$x_{ik} = \begin{cases} (AT)^{1/2}, & k = i, \\ 0, & k \ne i, \end{cases} \qquad i = 1, 2, \ldots, M, \quad k = 1, 2, \ldots. \tag{6}$$
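As an illustration of the orthogonal code in (6), the following minimal Python sketch builds the $M$ code words (truncated to their first $M$ coordinates, since all later coordinates are zero) and checks the energy constraint (3). The function names `orthogonal_code` and `energy` and the numerical values are illustrative assumptions, not part of the original paper.

```python
import math

def orthogonal_code(M, A, T):
    """Return M code words as lists of length M (hypothetical helper).

    Code word i has amplitude sqrt(A*T) in coordinate i and zero elsewhere,
    as in equation (6). Coordinates beyond M are all zero and are omitted.
    """
    amplitude = math.sqrt(A * T)
    return [[amplitude if k == i else 0.0 for k in range(M)] for i in range(M)]

def energy(x):
    """Sum of squared coordinates, the left-hand side of equation (3)."""
    return sum(v * v for v in x)

# Example values chosen only for illustration.
code = orthogonal_code(M=4, A=2.0, T=3.0)
```

Each word has energy $AT$ (here 6), and distinct words have zero inner product, which is what makes the decoder depend only on the coordinates $y_1, \ldots, y_M$.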

For this orthogonal code, the inner product of $\mathbf{y}$ and the $i$th
code word is

$$\langle \mathbf{x}_i, \mathbf{y} \rangle = y_i (AT)^{1/2}, \qquad i = 1, 2, \ldots, M;$$

so that the optimal decoding rule is

$$D(\mathbf{y}) = \mathbf{x}_i \quad \text{if } y_i > y_j \text{ for all } j \ne i,\ 1 \le j \le M. \tag{7}$$

With probability one, (7) is satisfied for exactly one $i$. Notice that
the coordinates $y_j$ ($j > M$) are irrelevant to the receiver. Further,
from the symmetry of the orthogonal code (6), we can, without loss
of generality, assume that code word $\mathbf{x}_1$ is transmitted. Hence,
the error probability is

$$P_e = P_{e1} = \Pr \bigcup_{j=2}^{M} \{y_1 \le y_j\}, \tag{8}$$

where the probability is computed with $\{y_j\}_1^M$ independent
unit-variance Gaussian random variables with $E y_1 = (AT)^{1/2}$ and
$E y_j = 0$ ($2 \le j \le M$).

Now suppose we can use a noiseless feedback link. As before, we
transmit one of a set of $M = e^{RT}$ orthogonal signals
$\{\mathbf{x}_i\}_1^M$, where $\mathbf{x}_i$ is given by (6). Instead of
the decoding rule (7), let us use the rule

$$D(\mathbf{y}) = \mathbf{x}_i \quad \text{if } y_i > y_j + \Delta \text{ for all } j \ne i,\ 1 \le j \le M, \tag{9}$$

where $\Delta > 0$ will be chosen later. If no $y_i$ satisfies (9) then we
request a retransmission via the feedback channel, and use (9) on
the second received vector, and so on. The probability of error
decreases as $\Delta$ increases. The price which we pay for this increased
reliability is an increase in the length of time which it will take to
complete the transmission of the $M$-ary message, and the consequent
reduction in the effective rate of transmission. In fact, let $E_R$ be the
event that we ask for a retransmission, and let $P(E_R)$ be its
probability. Then from the assumption that successive transmissions are
independent, the expected number of $T$-second transmissions required to
accept a message is

$$\begin{aligned}
\sum_{j=1}^{\infty} j \Pr\{j \text{ transmissions are required}\}
&= \sum_{j=1}^{\infty} j\,[1 - P(E_R)]\,[P(E_R)]^{j-1}
 = [1 - P(E_R)] \sum_{j=1}^{\infty} j\,[P(E_R)]^{j-1} \\
&= \frac{1 - P(E_R)}{[1 - P(E_R)]^2} = \frac{1}{1 - P(E_R)}.
\end{aligned}$$

Thus the average length of time required to transmit the $M$-ary message
is $\bar{T} = T/(1 - P(E_R))$. If $P(E_R)$ is small, then $\bar{T}$ is not
much greater than $T$.
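The decision rule (9) can be sketched in a few lines of Python. This is only an illustration: the function name `repeat_request_decode`, the argument name `threshold` (standing for the margin in rule (9)), and the 0-based indexing are assumptions made here. The function returns `None` when no coordinate clears the margin, which corresponds to a retransmission request.

```python
def repeat_request_decode(y, threshold):
    """Apply rule (9) to the first M received coordinates y (len(y) >= 2).

    Returns the 0-based index i such that y[i] > y[j] + threshold for all
    j != i, or None if no such index exists (a repeat request).
    """
    # Only the largest coordinate can possibly satisfy the rule.
    best = max(range(len(y)), key=lambda i: y[i])
    runner_up = max(v for j, v in enumerate(y) if j != best)
    return best if y[best] > runner_up + threshold else None
```

For example, `repeat_request_decode([3.0, 0.5, -1.0], 1.0)` accepts index 0, whereas `repeat_request_decode([3.0, 2.5, -1.0], 1.0)` returns `None` because the leading coordinate does not exceed the runner-up by the required margin.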

Suppose that we use this repeat-request strategy repeatedly; that is,
if the receiver does not call for a retransmission, then the transmitter
sends a new $M$-ary message. For $k = 1, 2, \ldots$, let the random variable
$N_k$ be the number of $M$-ary messages which the receiver accepts (that is,
it does not call for a retransmission) in $kT$ seconds. Then we can write

$$N_k = \sum_{j=1}^{k} \xi_j,$$

where the random variable $\xi_j = 1$ if the receiver accepts a message
on the $j$th $T$-second interval, and $\xi_j = 0$ otherwise. Note that
$\Pr\{\xi_j = 0\} = P(E_R)$, and that the $\{\xi_j\}_{j=1}^{\infty}$ are
independent (since we have assumed that successive $T$-second
transmissions are independent). Thus

$$\begin{aligned}
&\text{(i)} \quad E(N_k) = k\,E(\xi_1) = k\,(1 - P(E_R)), \\
&\text{(ii)} \quad N_k/k \to 1 - P(E_R) \text{ as } k \to \infty, \text{ with probability 1.}
\end{aligned} \tag{10}$$

Statement (ii) follows from the strong law of large numbers (see Ref.
3, p. 190). Since each $M$-ary message contains $\ln M = RT$ nats, the
effective rate of transmission $\bar{R}$ is, in the light of (10),

$$\bar{R} = \lim_{k \to \infty} \frac{N_k R T}{kT} \text{ nats/sec} = R[1 - P(E_R)] = R(T/\bar{T}). \tag{11}$$
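The geometric-series argument for the expected number of transmissions and the effective rate formula (11) can be checked numerically. The sketch below is a hedged illustration: `expected_transmissions` sums a truncated series (the truncation length `terms` is an arbitrary choice made here), and `effective_rate` simply evaluates $R[1 - P(E_R)]$.

```python
def expected_transmissions(p_repeat, terms=10_000):
    """Truncated sum of j * (1 - p) * p**(j - 1) over j = 1..terms."""
    return sum(j * (1.0 - p_repeat) * p_repeat ** (j - 1)
               for j in range(1, terms + 1))

def effective_rate(R, p_repeat):
    """Effective rate of equation (11): R times the acceptance probability."""
    return R * (1.0 - p_repeat)

# With a 10 percent repeat probability the mean number of transmissions is
# close to 1 / 0.9, and the effective rate is 90 percent of R.
mean_tx = expected_transmissions(0.1)
rate = effective_rate(1.0, 0.1)
```

The product of the two quantities returns the nominal rate $R$, which is the relation $\bar{R} = R(T/\bar{T})$ in (11).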

Let us turn our attention to the probability of error. Since we are
using the orthogonal code of (6), we can, as above, without loss of
generality, assume that code word $\mathbf{x}_1$ is transmitted. Using the
decoding rule of equation (9) we make an error only when for some $j > 1$,
$y_j > y_i + \Delta$ for all $i = 1, 2, \ldots, M$ with $i \ne j$. (In this
case $D(\mathbf{y}) = \mathbf{x}_j$.) Thus the error probability is

$$P_e = \Pr \bigcup_{j=2}^{M} \bigcap_{i \ne j} \{y_j > y_i + \Delta\}. \tag{12}$$

As in (8), the probability in equation (12) is computed with
$E y_1 = (AT)^{1/2}$ and $E y_j = 0$ ($2 \le j \le M$).

Let us further define $E_1$ as the event that either an error occurs or a
repeat-request occurs. If $\mathbf{x}_1$ is transmitted, $E_1$ has
probability

$$P(E_1) = \Pr \bigcup_{j=2}^{M} \{y_1 \le y_j + \Delta\}, \tag{13}$$

where, as above, the probability in (13) is computed with
$E y_1 = (AT)^{1/2}$ and $E y_j = 0$, $j > 1$. Clearly the probability of a
repeat-request is

$$P(E_R) = P(E_1) - P_e \le P(E_1). \tag{14}$$

Consider the parameter $\Delta$. In the interest of minimizing $P_e$, we
want to make $\Delta$ large. However, in the interest of minimizing
$P(E_R)$ and therefore making $\bar{R}$ as close to $R$ as possible, we want
to make $\Delta$ small. The approach which we will take is to choose
$\Delta$ just small enough so that as the parameter $T \to \infty$ ($R$ is
held fixed), $P(E_1) \to 0$; so that by (14), $P(E_R) \to 0$. Thus the
effective transmission rate $\bar{R} \approx R$. We will see that this
results in a considerable improvement in $P_e$ over that of equations (4)
and (5). Roughly speaking, we will show that the resulting exponent is
increased from that in equation (5) to approximately

$$E_F(R) = [C^{1/2} - R^{1/2}]^2 + C - R = 2C^{1/2}(C^{1/2} - R^{1/2}). \tag{15}$$

The exponents $E_0(R)$ and $E_F(R)$ are plotted in Fig. 1. Notice that the
improvement is greatest in the neighborhood of capacity where (as
$R \to C$) $E_F(R) \approx (C - R)$ and $E_0(R) \approx (C - R)^2/4C$.
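To make the comparison between the one-way exponent (5) and the repeat-request exponent (15) concrete, the following Python sketch evaluates both. The function names and the sample values of $R$ and $C$ are illustrative assumptions; the formulas are those of equations (5) and (15).

```python
import math

def one_way_exponent(R, C):
    """E_0(R) of equation (5), defined for 0 <= R <= C."""
    if not 0.0 <= R <= C:
        raise ValueError("R must lie in [0, C]")
    if R <= C / 4.0:
        return C / 2.0 - R
    return (math.sqrt(C) - math.sqrt(R)) ** 2

def repeat_request_exponent(R, C):
    """E_F(R) of equation (15), defined for 0 <= R <= C."""
    if not 0.0 <= R <= C:
        raise ValueError("R must lie in [0, C]")
    return (math.sqrt(C) - math.sqrt(R)) ** 2 + C - R

# Near capacity the repeat-request exponent behaves like C - R, while the
# one-way exponent behaves like (C - R)^2 / (4C).
C = 1.0
R = 0.99
ratio = repeat_request_exponent(R, C) / one_way_exponent(R, C)
```

Under these assumed values the ratio is in the hundreds, which reflects the linear versus quadratic behavior of the two exponents as $R \to C$.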



II. SUMMARY AND DISCUSSION OF RESULTS 

The main result is given as a corollary to the following two theorems
which provide information on the trade-off between $P_e$ and $P(E_1)$ as
$\Delta$ is varied. The proofs are given in Section III.

Fig. 1. Exponents for the white Gaussian noise channel: $E_0(R)$, the
one-way exponent; $E_F(R)$, the repeat-request exponent.

Theorem 1: Let $\{y_j\}_1^M$ be independent Gaussian random variables with
unit variance and expectation

$$E y_1 = (AT)^{1/2}, \qquad E y_j = 0, \quad 2 \le j \le M. \tag{16}$$

Let $M = e^{RT}$, where $0 < R < A/2 = C$, and let
$\Delta = \delta(2T)^{1/2}$, where

$$C^{1/2} - (4R)^{1/2} \le \delta < C^{1/2} - R^{1/2}. \tag{17}$$

Then

$$P(E_1) = \Pr \bigcup_{j=2}^{M} \{y_1 \le y_j + \Delta\} \le 2 \exp\{-[C^{1/2} - R^{1/2} - \delta]^2\,T\}. \tag{18}$$

Notice that $\delta = 0$ will satisfy (17) if $R \ge C/4$. In this case
$P(E_1) = P_e$ (see (8)), and (18) yields
$E_0(R) \ge [C^{1/2} - R^{1/2}]^2$ ($C/4 \le R \le C$), a fact
which is contained in (5). In fact, the proof of Theorem 1 closely
parallels the derivation of $P_e$ for orthogonal codes (for a one-way
channel).
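The right-hand side of (18) is easy to evaluate numerically. The sketch below is an illustration only: the function name `theorem1_bound` is an assumption, and the function simply checks the range condition (17) on $\delta$ before evaluating the bound.

```python
import math

def theorem1_bound(R, C, delta, T):
    """Evaluate 2 * exp(-(sqrt(C) - sqrt(R) - delta)**2 * T), as in (18).

    Raises ValueError if R or delta violate the hypotheses of Theorem 1.
    """
    sC, sR = math.sqrt(C), math.sqrt(R)
    if not 0.0 < R < C:
        raise ValueError("need 0 < R < C")
    if not (sC - 2.0 * sR <= delta < sC - sR):
        raise ValueError("delta violates condition (17)")
    return 2.0 * math.exp(-((sC - sR - delta) ** 2) * T)

# With delta = 0 and R = C/4 the exponent equals (sqrt(C) - sqrt(R))**2 = C/4.
bound = theorem1_bound(R=0.25, C=1.0, delta=0.0, T=100.0)
```

With these assumed values the bound equals $2e^{-25}$, which matches the one-way exponent (5) evaluated at $R = C/4$.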

Theorem 2: Let $\{y_j\}_1^M$ be independent Gaussian random variables with
unit variance and expectation

$$E y_1 = (AT)^{1/2}, \qquad E y_j = 0, \quad 2 \le j \le M. \tag{19}$$

Let $M = e^{RT}$, where $0 \le R < A/2 = C$, and let
$\Delta = \delta(2T)^{1/2}$, where

$$\delta > C^{1/2} - (4R)^{1/2}. \tag{20}$$

With $R$ and $\delta$ held fixed, and $\theta_1$, $\theta_2$ arbitrary but
satisfying

$$\theta_1 > 0, \tag{21a}$$

$$0 < \theta_2 < \min\left\{ R^{1/2},\ \frac{\delta - [C^{1/2} - (4R)^{1/2}]}{2} \right\}, \tag{21b}$$

then for $T$ sufficiently large,

$$P_e = \Pr \bigcup_{j=2}^{M} \bigcap_{i \ne j} \{y_j > y_i + \Delta\}
\le 2(1 + \theta_1) \exp\{-[(R^{1/2} + \delta - \theta_2)^2 + (C^{1/2} - R^{1/2} + \theta_2)^2 - R]\,T\}. \tag{22}$$

Again notice that $\delta = 0$ will satisfy (20) if $R > C/4$. In this case
also, (22) yields $E_0(R) \ge [C^{1/2} - R^{1/2}]^2$, when $R > C/4$ (since
$\theta_2$ can be made arbitrarily small).
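The right-hand side of (22) can likewise be evaluated directly. The sketch below is a hedged illustration: the function name `theorem2_bound` is an assumption, the upper limit imposed on $\theta_2$ is the pair of inequalities used for $\theta_2$ in the proof in Section III ($\theta_2 < R^{1/2}$ and $2\theta_2 \le \delta - [C^{1/2} - (4R)^{1/2}]$), and the function evaluates only the bound itself, not the "for $T$ sufficiently large" qualifier.

```python
import math

def theorem2_bound(R, C, delta, T, theta1, theta2):
    """Evaluate the right-hand side of (22) after checking (20) and (21)."""
    sC, sR = math.sqrt(C), math.sqrt(R)
    slack = delta - (sC - 2.0 * sR)          # must be positive by (20)
    if not 0.0 <= R < C or slack <= 0.0:
        raise ValueError("R or delta violates the hypotheses")
    if theta1 <= 0.0 or not 0.0 < theta2 < min(sR, slack / 2.0):
        raise ValueError("theta1 or theta2 violates (21)")
    exponent = (sR + delta - theta2) ** 2 + (sC - sR + theta2) ** 2 - R
    return 2.0 * (1.0 + theta1) * math.exp(-exponent * T)

# Illustrative values: exponent = 0.65**2 + 0.55**2 - 0.25 = 0.475.
bound2 = theorem2_bound(R=0.25, C=1.0, delta=0.2, T=10.0,
                        theta1=0.1, theta2=0.05)
```

Increasing `delta` within the allowed range increases the exponent, which is the sense in which a larger decision margin lowers the error probability.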

Let us now use these theorems to find the value of
$\Delta = \delta(2T)^{1/2}$ which gives the smallest upper bound on $P_e$
without substantially changing the effective rate
$\bar{R} \ge R[1 - P(E_1)]$. Since $P_e$ is a decreasing function
of $\delta$, we choose $\delta$ as large as possible with the proviso that
$P(E_1) \to 0$. From Theorem 1, this value of $\delta$ is

$$\delta = C^{1/2} - R^{1/2} - \gamma_1, \tag{23}$$

where $\gamma_1 > 0$. If $\gamma_1$ is sufficiently small, this choice of
$\delta$ satisfies (17) and (20). With $\delta$ so chosen, for any
$\gamma_2 > 0$ we can find a $T$ sufficiently large so that
$\bar{R} \ge R(1 - \gamma_2)$. Further, substitution of equation (23)
into equation (22) yields an exponent

$$-[(C^{1/2} - \gamma_1 - \theta_2)^2 + (C^{1/2} - R^{1/2} + \theta_2)^2 - R]\,T.$$

Finally, since $\gamma_1$, $\gamma_2$, $\theta_1$ and $\theta_2$ can be made
arbitrarily small we have our main result:

Corollary: Let $\theta_1 > 0$, $\epsilon > 0$ be arbitrary. Let $R < C$.
Then for $T$ sufficiently large, there is a repeat-request communication
system using orthogonal codes with an effective rate of $R$ and error
probability

$$P_e \le 2(1 + \theta_1) \exp\{-[(C^{1/2} - R^{1/2})^2 + C - R - \epsilon]\,T\}.$$
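The limiting step behind the corollary can be illustrated numerically: substituting (23) into the exponent of (22) and letting the slack parameters shrink recovers $(C^{1/2} - R^{1/2})^2 + C - R$. In the sketch below the function name `corollary_exponent` and the parameter names `gamma1` and `theta2` are labels chosen here for the two small slack parameters of the text.

```python
import math

def corollary_exponent(R, C, gamma1, theta2):
    """Exponent of (22) with delta chosen as in (23)."""
    sC, sR = math.sqrt(C), math.sqrt(R)
    delta = sC - sR - gamma1
    return (sR + delta - theta2) ** 2 + (sC - sR + theta2) ** 2 - R

def limiting_exponent(R, C):
    """Target exponent (sqrt(C) - sqrt(R))**2 + C - R."""
    return (math.sqrt(C) - math.sqrt(R)) ** 2 + C - R

# As gamma1 and theta2 shrink, the gap to the limiting exponent closes.
gaps = [limiting_exponent(0.25, 1.0) - corollary_exponent(0.25, 1.0, g, g)
        for g in (1e-2, 1e-4, 1e-6)]
```

The gaps are positive and decrease toward zero, so any $\epsilon > 0$ in the corollary can be met by taking the slack parameters small enough.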

Let us turn our attention to (4) and (5) which give the error
probability for the one-way Gaussian channel. The fact that
$E_0(R) \le (C^{1/2} - R^{1/2})^2$ can be demonstrated by a
"sphere-packing" argument (Ref. 9). This argument states that
$P_e^*(M, T) \ge Q$, where $Q$ is the probability of error which would
result if it were possible to subdivide Euclidean $M$-space into $M$
congruent cones (each with apex at the origin), one for each code word, and
each code word were placed on the axis of its cone at a distance
$(AT)^{1/2}$ from the origin. Setting the "sphere-packing exponent"

$$E_{SP}(R) = (C^{1/2} - R^{1/2})^2,$$

we have from the above corollary that for effective transmission rates
$R < C$ we can obtain an error exponent arbitrarily close to

$$E_F(R) = E_{SP}(R) + C - R. \tag{24}$$

For discrete memoryless channels it is possible to find a lower bound
to the optimal (one-way) error probability using an analogous
sphere-packing argument (Ref. 7). Forney showed that using a repeat-request
strategy similar to the one used here, one can obtain an error exponent
arbitrarily close to that of equation (24) [with the appropriate
$E_{SP}(R)$] (Ref. 3). Forney also studied the so-called (discrete) "very
noisy channel," which is closely related to our Gaussian channel,* and
obtained results similar to our results. Thus, in the light of Forney's
results, the above corollary is not surprising.

* Our Gaussian channel may be thought of as a "very noisy channel" since
the signal-to-noise ratio per coordinate is zero.

Let us also remark that Kramer has found a scheme for our white
noise channel with a feedback link that attains an error exponent of
$C - R$, which is less than that in equation (24). In Kramer's scheme,
the receiver observes the signal until it is sufficiently confident that
one particular message was actually transmitted. It then informs the
transmitter, via the feedback channel, to start the next $M$-ary
transmission, thereby using the feedback channel only once per $M$-ary
message. In the repeat-request scheme studied here, the number of
uses of the feedback channel per $M$-ary transmission is an unbounded
random variable. Thus the two schemes, while similar (in that the
feedback channel is used only to convey a "decision"), are not
directly comparable. On the other hand, there are schemes which use
the feedback channel considerably more heavily (so-called "information
feedback") which in some cases attain somewhat better performance than
the repeat-request strategy. (See for example Refs. 5, 6, and 10.)

Finally, an important problem which has been completely ignored 
here is the requirement that the transmitter have a buffer in which it 
can store data which will accumulate at the transmitter at times when 
the receiver asks for retransmissions. If the buffer has finite capacity, 
it will occasionally overflow, introducing a further source of errors. 
Some quantitative results on this problem have been obtained by the 
author, and will be reported in a future paper. 

III. PROOFS OF THEOREMS 

We begin with some definitions. Let

$$g(\alpha) = \frac{1}{(2\pi)^{1/2}} \exp(-\alpha^2/2), \qquad -\infty < \alpha < \infty,$$

be the standard Gaussian density, and let

$$\Phi(u) = \int_{-\infty}^{u} g(\alpha)\,d\alpha, \qquad -\infty < u < \infty,$$

be the cumulative error function, and let

$$\Phi_c(u) = \int_{u}^{\infty} g(\alpha)\,d\alpha = 1 - \Phi(u), \qquad -\infty < u < \infty,$$

be the complementary error function. Let $b = (AT)^{1/2} = (2CT)^{1/2}$ so
that $y_1$ has density $g(\alpha - b)$ and $y_j$ ($2 \le j \le M$) has
density $g(\alpha)$. We will use the following

Lemma 1: For $u \ge 0$, $\Phi_c(u) \le \exp(-u^2/2)$; and for $u \le 0$,
$\Phi(u) \le \exp(-u^2/2)$ (Wozencraft and Jacobs, Ref. 8).

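Proof: For $u \ge 0$,

$$[\Phi_c(u)]^2 = \int_u^{\infty} \int_u^{\infty} g(\alpha) g(\beta)\,d\alpha\,d\beta
\le \iint_{\mathcal{R}} g(\alpha) g(\beta)\,d\alpha\,d\beta = \frac{\exp(-u^2)}{4},$$

where $\mathcal{R} = \{(\alpha, \beta) : \alpha^2 + \beta^2 \ge 2u^2,\ \alpha \ge 0,\ \beta \ge 0\}$.
Taking square roots, we have

$$\Phi_c(u) \le \frac{\exp(-u^2/2)}{2} \le \exp(-u^2/2).$$

The rest of Lemma 1 follows on noting that $\Phi(u) = \Phi_c(-u)$.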
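Lemma 1 can be spot-checked numerically with the standard library. The sketch below is an illustration under the assumption that the Gaussian tail is computed from `math.erfc` via $\Phi_c(u) = \tfrac{1}{2}\,\mathrm{erfc}(u/\sqrt{2})$; the helper names `phi_c` and `lemma1_holds` are chosen here.

```python
import math

def phi_c(u):
    """Upper tail of the standard Gaussian distribution."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def lemma1_holds(u):
    """Check phi_c(u) <= exp(-u^2/2)/2 <= exp(-u^2/2) for a given u >= 0."""
    envelope = math.exp(-u * u / 2.0)
    return phi_c(u) <= 0.5 * envelope <= envelope

# Check the inequality on a grid of non-negative arguments.
grid = [0.1 * n for n in range(0, 101)]
all_hold = all(lemma1_holds(u) for u in grid)
```

The check covers the sharper intermediate bound $\exp(-u^2/2)/2$ from the proof as well as the looser bound stated in the lemma. A grid check is not a proof, but it catches sign or factor errors in a transcription of the lemma.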

Proof of Theorem 1: Let $R$ ($0 < R < C$) and $\delta$ satisfying (17) be
given. Since $y_1$ has density $g(\alpha - b)$, and the $\{y_j\}_1^M$ are
independent,

$$P(E_1) = \Pr \bigcup_{j=2}^{M} \{y_j \ge y_1 - \Delta\}
= \int_{-\infty}^{\infty} d\alpha\, g(\alpha - b) \Pr \bigcup_{j=2}^{M} \{y_j \ge \alpha - \Delta\}. \tag{25}$$

Now since the $y_j$ ($j > 1$) have density $g(\alpha)$,

$$\Pr \bigcup_{j=2}^{M} \{y_j \ge \alpha - \Delta\} \le
\begin{cases} 1, \\ (M - 1) \Pr\{y_j \ge \alpha - \Delta\} \le M \Phi_c(\alpha - \Delta). \end{cases} \tag{26}$$

Letting $a$ be a parameter to be specified later, we break the integral of
equation (25) into two parts, $\alpha \le a$ and $\alpha \ge a$. We then
apply the first upper bound of (26) in the first part, and the second bound
of (26) in the second part. Thus

$$P(E_1) \le \int_{-\infty}^{a} g(\alpha - b)\,d\alpha + M \int_{a}^{\infty} g(\alpha - b) \Phi_c(\alpha - \Delta)\,d\alpha.$$

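If we assume that

$$a \ge \Delta, \tag{27}$$

we can use the bound of Lemma 1 on $\Phi_c(\alpha - \Delta)$ and obtain

$$P(E_1) \le \int_{-\infty}^{a} g(\alpha - b)\,d\alpha + M \int_{a}^{\infty} g(\alpha - b) \exp[-(\alpha - \Delta)^2/2]\,d\alpha
= P_1 + M P_2. \tag{28}$$

We now overbound $P_1$ and $P_2$. First,

$$P_1 = \int_{-\infty}^{a} g(\alpha - b)\,d\alpha = \int_{-\infty}^{a - b} g(\alpha)\,d\alpha = \Phi(a - b).$$

If we further assume that

$$a \le b, \tag{29}$$

we can use Lemma 1 and obtain

$$P_1 \le \exp[-(b - a)^2/2]. \tag{30}$$

Second,

$$\begin{aligned}
P_2 &= \int_{a}^{\infty} \frac{1}{(2\pi)^{1/2}} \exp[-\tfrac{1}{2}(\alpha - b)^2] \exp[-\tfrac{1}{2}(\alpha - \Delta)^2]\,d\alpha \\
&= \int_{a}^{\infty} \frac{1}{(2\pi)^{1/2}} \exp\left\{-\left[\alpha - \frac{b + \Delta}{2}\right]^2\right\} \exp[-(b - \Delta)^2/4]\,d\alpha \\
&= \frac{\exp[-(b - \Delta)^2/4]}{\sqrt{2}} \cdot \frac{1}{(2\pi)^{1/2}} \int_{\sqrt{2}[a - (b + \Delta)/2]}^{\infty} \exp(-\gamma^2/2)\,d\gamma.
\end{aligned}$$

If we now make a third assumption that

$$a \ge \frac{b + \Delta}{2}, \tag{31}$$

we can use Lemma 1 again (and $2^{-1/2} \le 1$) to bound $P_2$:

$$\begin{aligned}
P_2 &\le \exp[-(b - \Delta)^2/4] \exp\left\{-\left[a - \frac{b + \Delta}{2}\right]^2\right\} \\
&= \exp[-(b - a)^2/2] \exp[-(a - \Delta)^2/2].
\end{aligned} \tag{32}$$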
Inserting the bounds on $P_1$ and $P_2$ into (28), we obtain

$$P(E_1) \le \exp[-(b - a)^2/2]\,\{1 + M \exp[-(a - \Delta)^2/2]\}, \tag{33a}$$

where from (27), (29), and (31),

$$\max\left\{\Delta,\ \frac{b + \Delta}{2}\right\} \le a \le b. \tag{33b}$$

It remains to choose the parameter $a$. A good choice will probably
result when the upper bound of (28) is differentiated with respect to $a$
and the result set equal to zero:

$$g(a - b) - M g(a - b) \exp[-(a - \Delta)^2/2] = 0,$$

or

$$M \exp[-(a - \Delta)^2/2] = 1, \tag{34a}$$

or, since $M = \exp(RT)$ and $\Delta = \delta(2T)^{1/2}$,

$$a = (R^{1/2} + \delta)(2T)^{1/2}. \tag{34b}$$

Let us now verify that when $0 < R < C$, constraints (33b) are satisfied
for this choice of $a$. Since $R > 0$, $a \ge \Delta$. Further, since
$b = (2CT)^{1/2}$,

$$a - \frac{b + \Delta}{2} = \left(\delta - [C^{1/2} - (4R)^{1/2}]\right) \frac{(2T)^{1/2}}{2} \ge 0,$$

since $\delta$ satisfies (17). Finally, from (17),

$$b - a = [C^{1/2} - (R^{1/2} + \delta)](2T)^{1/2} \ge 0.$$

Thus constraints (33b) are, in fact, satisfied. Thus from (34) and (33a)

$$P(E_1) \le 2 \exp[-(C^{1/2} - R^{1/2} - \delta)^2\,T],$$

which is Theorem 1.
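The choice of the split point in (34b) and the constraints (33b) can be verified numerically for particular parameter values. The sketch below is illustrative only: the function name `theorem1_split_point`, the variable name `threshold` (the decision margin $\delta(2T)^{1/2}$), and the sample values are assumptions made here.

```python
import math

def theorem1_split_point(R, C, delta, T):
    """Return (a, feasible, bound) for the split point of equation (34b).

    feasible reports whether max(threshold, (b + threshold)/2) <= a <= b,
    and bound evaluates the right-hand side of (33a) with M = exp(R*T).
    """
    root = math.sqrt(2.0 * T)
    a = (math.sqrt(R) + delta) * root
    b = math.sqrt(C) * root
    threshold = delta * root
    feasible = max(threshold, (b + threshold) / 2.0) <= a <= b
    bound = math.exp(-(b - a) ** 2 / 2.0) * (
        1.0 + math.exp(R * T) * math.exp(-(a - threshold) ** 2 / 2.0))
    return a, feasible, bound

a, feasible, bound = theorem1_split_point(R=0.25, C=1.0, delta=0.1, T=8.0)
```

For these assumed values the split point is feasible and the bound reduces to $2\exp[-(C^{1/2} - R^{1/2} - \delta)^2 T] = 2e^{-1.28}$, in agreement with the statement of Theorem 1.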

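Proof of Theorem 2: Let $R$ ($0 \le R < C$),
$\delta > C^{1/2} - (4R)^{1/2}$, and $\theta_1$, $\theta_2$ satisfying
equation (21) be given. Then

$$P_e = \Pr \bigcup_{j=2}^{M} \bigcap_{i \ne j} \{y_i < y_j - \Delta\}
\le \sum_{j=2}^{M} \Pr \bigcap_{i \ne j} \{y_i < y_j - \Delta\},$$

or

$$P_e \le M \Pr \bigcap_{i \ne j} \{y_i < y_j - \Delta\}, \qquad j \ge 2.$$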

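The last inequality follows from the symmetry of the distributions of
the $y_j$ ($j \ge 2$). Recalling that the density for $y_j$ ($j \ge 2$) is
$g(\alpha)$, and that the $\{y_i\}_1^M$ are independent,

$$\begin{aligned}
P_e &\le M \int_{-\infty}^{\infty} g(\alpha)\,d\alpha\, \Pr\left\{ y_i < y_j - \Delta \text{ for all } i \ne j \,\middle|\, y_j = \alpha \right\} \\
&= M \int_{-\infty}^{\infty} g(\alpha)\,d\alpha\, \Pr \bigcap_{i \ne j} \{y_i < \alpha - \Delta\}.
\end{aligned}$$

Again using the independence of the $y_i$ and the fact that the density of
$y_1$ is $g(\alpha - b)$ we have

$$\begin{aligned}
\Pr \bigcap_{i \ne j} \{y_i < \alpha - \Delta\}
&= \left[ \int_{-\infty}^{\alpha - \Delta} g(\beta - b)\,d\beta \right] \left[ \int_{-\infty}^{\alpha - \Delta} g(\beta)\,d\beta \right]^{M-2} \\
&= \Phi(\alpha - \Delta - b)\,[\Phi(\alpha - \Delta)]^{M-2}.
\end{aligned}$$

Substituting, we obtain

$$P_e \le M \int_{-\infty}^{\infty} g(\alpha)\,\Phi(\alpha - \Delta - b)\,[\Phi(\alpha - \Delta)]^{M-2}\,d\alpha. \tag{35}$$

Also note that

$$[\Phi(\alpha - \Delta)]^{M-2} = [1 - \Phi_c(\alpha - \Delta)]^{M-2} \le \exp[-(M - 2)\Phi_c(\alpha - \Delta)]. \tag{36}$$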

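As in the proof of Theorem 1, we break the integral in (35) into two
parts $\alpha \le a$ and $\alpha \ge a$, where $a$ will be specified later.
In the range $\alpha \le a$ we overbound $\Phi(\alpha - \Delta - b)$ by
unity, and $[\Phi(\alpha - \Delta)]^{M-2}$ by (36). In the range
$\alpha \ge a$, we overbound $[\Phi(\alpha - \Delta)]^{M-2}$ by unity. Thus

$$P_e \le M \int_{-\infty}^{a} g(\alpha) \exp[-(M - 2)\Phi_c(\alpha - \Delta)]\,d\alpha
+ M \int_{a}^{\infty} g(\alpha)\,\Phi(\alpha - \Delta - b)\,d\alpha = M P_1 + M P_2. \tag{37}$$

We now overbound $P_1$ and $P_2$. First,

$$\begin{aligned}
P_1 &= \int_{-\infty}^{a} g(\alpha) \exp[-(M - 2)\Phi_c(\alpha - \Delta)]\,d\alpha \\
&\le \exp[-(M - 2)\Phi_c(a - \Delta)] \int_{-\infty}^{a} g(\alpha)\,d\alpha \\
&\le \exp[-(M - 2)\Phi_c(a - \Delta)].
\end{aligned} \tag{38}$$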

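Second, if we assume that

$$a \le b + \Delta, \tag{39}$$

we can write

$$P_2 = \int_{a}^{\infty} g(\alpha)\,\Phi(\alpha - \Delta - b)\,d\alpha
= \int_{a}^{\Delta + b} g(\alpha)\,\Phi(\alpha - \Delta - b)\,d\alpha + \int_{\Delta + b}^{\infty} g(\alpha)\,\Phi(\alpha - \Delta - b)\,d\alpha.$$

In the first integral, $\alpha - \Delta - b \le 0$, so that we may use
Lemma 1 to bound $\Phi(\alpha - \Delta - b)$. In the second integral, we
overbound $\Phi(\alpha - \Delta - b)$ by unity. Thus

$$\begin{aligned}
P_2 &\le \int_{a}^{\Delta + b} g(\alpha) \exp[-(\alpha - \Delta - b)^2/2]\,d\alpha + \int_{\Delta + b}^{\infty} g(\alpha)\,d\alpha \\
&\le \int_{a}^{\infty} g(\alpha) \exp[-(\alpha - \Delta - b)^2/2]\,d\alpha + \Phi_c(\Delta + b).
\end{aligned}$$

Since from (20) and the fact that $R < C$,

$$\Delta + b = (\delta + C^{1/2})(2T)^{1/2} > 2(C^{1/2} - R^{1/2})(2T)^{1/2} > 0,$$

we can again use Lemma 1 to overbound $\Phi_c(\Delta + b)$. Using the
definition of $g(\alpha)$, we have

$$\begin{aligned}
P_2 &\le \int_{a}^{\infty} \frac{1}{(2\pi)^{1/2}} \exp(-\alpha^2/2) \exp[-(\alpha - \Delta - b)^2/2]\,d\alpha + \exp[-(\Delta + b)^2/2] \\
&= \exp[-(b + \Delta)^2/4]\, \frac{1}{(2\pi)^{1/2}} \int_{a}^{\infty} \exp\left\{-\left[\alpha - \frac{b + \Delta}{2}\right]^2\right\} d\alpha + \exp[-(\Delta + b)^2/2] \\
&= \exp[-(b + \Delta)^2/4]\, \frac{2^{-1/2}}{(2\pi)^{1/2}} \int_{\sqrt{2}[a - (b + \Delta)/2]}^{\infty} \exp(-\gamma^2/2)\,d\gamma + \exp[-(\Delta + b)^2/2] \\
&\le \exp[-(b + \Delta)^2/4]\, \Phi_c\left[\sqrt{2}\left(a - \frac{b + \Delta}{2}\right)\right] + \exp[-(\Delta + b)^2/2].
\end{aligned}$$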

If we further assume that

$$a \ge (b + \Delta)/2, \tag{40}$$

then we can again employ Lemma 1 to bound
$\Phi_c[\sqrt{2}(a - (b + \Delta)/2)]$. Hence

$$\begin{aligned}
P_2 &\le \exp[-(b + \Delta)^2/4] \exp\left\{-\left[a - \frac{b + \Delta}{2}\right]^2\right\} + \exp[-(\Delta + b)^2/2] \\
&= \exp\{-\tfrac{1}{2}[a^2 + (a - \Delta - b)^2]\} + \exp[-(\Delta + b)^2/2].
\end{aligned} \tag{41}$$

The difference between the second and first exponents in (41) is

$$-\tfrac{1}{2}\{(\Delta + b)^2 - [a^2 + (a - \Delta - b)^2]\} = a[a - (\Delta + b)] \le 0,$$

by (39) and (40). Thus, the first term of (41) is not less than the
second, and

$$P_2 \le 2 \exp\{-\tfrac{1}{2}[a^2 + (a - \Delta - b)^2]\}. \tag{42}$$

Inserting the bounds on $P_1$ (38) and $P_2$ (42) into (37), we obtain

$$P_e \le M \exp[-(M - 2)\Phi_c(a - \Delta)] + 2M \exp\{-\tfrac{1}{2}[a^2 + (a - \Delta - b)^2]\}, \tag{43a}$$

where from equations (39) and (40),

$$\frac{b + \Delta}{2} \le a \le b + \Delta. \tag{43b}$$

It remains to choose the parameter $a$, and here we will simply state
a good choice of $a$ without giving a motivating argument. Let

$$a = (R^{1/2} + \delta - \theta_2)(2T)^{1/2} \tag{44}$$

(where $\theta_2$ is the arbitrary parameter which was selected at the
beginning of the proof). We must verify that constraints (43b) are
satisfied for this choice of $a$. First, since $R < C$ and $\theta_2 > 0$,

$$b + \Delta - a = (C^{1/2} - R^{1/2} + \theta_2)(2T)^{1/2} > 0.$$

Thus $a \le b + \Delta$. Second, from equation (21b),

$$a - \frac{b + \Delta}{2} = \tfrac{1}{2}\{\delta - [C^{1/2} - (4R)^{1/2}] - 2\theta_2\}(2T)^{1/2} \ge 0,$$

so that $a \ge (b + \Delta)/2$ and (43b) is satisfied.
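The feasibility check for the split point (44) can be reproduced numerically. In the sketch below the function name `theorem2_split_point` and the variable name `threshold` (the decision margin $\delta(2T)^{1/2}$) are assumptions made for illustration; the second call deliberately uses a `theta2` that is too large, to show the lower constraint in (43b) failing.

```python
import math

def theorem2_split_point(R, C, delta, theta2, T):
    """Return (a, feasible) for the split point of equation (44).

    feasible reports whether (b + threshold)/2 <= a <= b + threshold,
    that is, whether constraints (43b) hold.
    """
    root = math.sqrt(2.0 * T)
    a = (math.sqrt(R) + delta - theta2) * root
    b = math.sqrt(C) * root
    threshold = delta * root
    return a, (b + threshold) / 2.0 <= a <= b + threshold

# theta2 = 0.05 is small enough here; theta2 = 0.2 is not.
a_ok, feasible_ok = theorem2_split_point(0.25, 1.0, 0.2, 0.05, 8.0)
a_bad, feasible_bad = theorem2_split_point(0.25, 1.0, 0.2, 0.2, 8.0)
```

With $R = 0.25$, $C = 1$ and $\delta = 0.2$, the lower constraint requires $2\theta_2 \le \delta - [C^{1/2} - (4R)^{1/2}] = 0.2$, so the first call is feasible and the second is not.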

Now consider the second term in (43a). Direct substitution of (44)
shows that this term is

$$2M \exp\{-[(R^{1/2} + \delta - \theta_2)^2 + (C^{1/2} - R^{1/2} + \theta_2)^2]\,T\},$$

a single exponential decay in $T$ (as $T \to \infty$). Finally consider the
exponent of the first term of (43a). Substituting (44), it is

$$-(M - 2)\Phi_c(a - \Delta) = -(\exp(RT) - 2)\,\Phi_c\{[R^{1/2} - \theta_2](2T)^{1/2}\}.$$

Making use of the asymptotic formula
$\Phi_c(u) \sim (2\pi)^{-1/2} u^{-1} e^{-u^2/2}$ as $u \to \infty$ (see
p. 106 of Ref. 2), and letting $T \to \infty$ (and noting that from
equation (21b), $R^{1/2} - \theta_2 > 0$), this exponent is asymptotic to

$$-\frac{\exp(+KT)}{(2\pi)^{1/2}(R^{1/2} - \theta_2)(2T)^{1/2}},$$

where $K = R - (R^{1/2} - \theta_2)^2 > 0$. Thus the first term of equation
(43a) decays to zero as a double exponential in $T$, very much more rapidly
than the second term of equation (43a). We can find a $T$ sufficiently
large so that the ratio of the first to second terms of equation (43a) is
at most $\theta_1$. With $T$ so chosen

$$P_e \le (1 + \theta_1)\,2 \exp\{-[(R^{1/2} + \delta - \theta_2)^2 + (C^{1/2} - R^{1/2} + \theta_2)^2 - R]\,T\},$$

which is Theorem 2.
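The statement that the first term of (43a) is eventually negligible compared with the second can be illustrated by comparing the two terms in the log domain. The sketch below is a hedged illustration: the helper names `phi_c` and `log_term_ratio`, the use of `math.erfc` for the Gaussian tail, and the sample parameter values are assumptions made here.

```python
import math

def phi_c(u):
    """Upper tail of the standard Gaussian distribution."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def log_term_ratio(R, C, delta, theta2, T):
    """Natural log of (first term / second term) of (43a), with a as in (44)."""
    root = math.sqrt(2.0 * T)
    a = (math.sqrt(R) + delta - theta2) * root
    b = math.sqrt(C) * root
    threshold = delta * root           # the decision margin delta*(2T)^(1/2)
    M = math.exp(R * T)
    log_first = math.log(M) - (M - 2.0) * phi_c(a - threshold)
    log_second = math.log(2.0 * M) - 0.5 * (a * a + (a - threshold - b) ** 2)
    return log_first - log_second

# For a short delay the first term still dominates; for a long delay it is
# smaller than the second term by hundreds of nats.
short_delay = log_term_ratio(0.25, 1.0, 0.2, 0.05, 20.0)
long_delay = log_term_ratio(0.25, 1.0, 0.2, 0.05, 200.0)
```

The sign change between the two cases shows why Theorem 2 is stated only for $T$ sufficiently large: the required delay depends on the parameters, including how small $\theta_1$ is taken.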